On the Boundedness of the Support of Optimal 
Input Measures for Rayleigh Fading Channels 
Abstract: We consider transmission over a wireless multiple
antenna communication system operating in a Rayleigh flat
fading environment with no channel state information at the
receiver or the transmitter and with coherence time T = 1. We
show that, subject to an average power constraint, the support
of the capacity achieving input distribution is bounded. Moreover,
we show by a simple example concerning the identity theorem
(or uniqueness theorem) from complex analysis in several
variables that some of the existing results in the field are not
rigorous.

I. Introduction 

We show in this paper by elementary means that the support
of the capacity achieving input measure for multiple-input
multiple-output (MIMO) Rayleigh fading channels subject to
an average power constraint with coherence time T = 1 is
bounded. A generalization of the result to coherence intervals
of size T > 1 seems to be highly non-trivial and will probably
require a substantial extension of the techniques used here,
supplemented by some results and methods from "hard
analysis".

Previous fundamental achievements, e.g. [1], [3], [5], [6],
follow the same procedure, which can be traced back to the
classic paper [8] by Smith. The basic tools are the
Karush-Kuhn-Tucker (KKT) conditions from the theory of convex
optimization, supported by an application of the identity
theorem (also known as the uniqueness theorem) from complex
analysis. Our approach is based on the KKT conditions too
but avoids the use of the identity theorem.
In [1] Abou-Faycal, Trott, and Shamai proved, using these
techniques, that for a one-dimensional Rayleigh fading channel
the optimal input measure subject to an average power
constraint is discrete with a finite number of mass points.
In [3] Chan, Hranilovic, and Kschischang showed for a MIMO
Rayleigh block-fading channel with i.i.d. channel matrix
coefficients that the optimum input distribution subject to peak
and average power constraints contains a finite number of mass
points with respect to a specific norm. In addition Fozunbal,
McLaughlin, and Schafer argued in [5] that a bounded support
of the capacity maximizer implies its singularity with respect
to the Borel-Lebesgue measure. The approach in [5], [3] is
based on the identity theorem for holomorphic functions in
several complex variables and uses the assumption that an
open set in $\mathbb{R}^n$ fulfills the hypothesis of the identity theorem
in $\mathbb{C}^n$. We show in Section IV by a simple example that
the conclusion of the identity theorem fails in this setting.
Consequently, these results are not rigorous. Since, in contrast
to complex analysis in one variable, it is still a difficult open
problem to characterize the families of sets for which
the identity theorem for holomorphic functions in several
complex variables holds, we cannot hope to understand the
properties of the capacity maximizers in the present setting
by a reduction to uniqueness properties of holomorphic
functions in higher dimensions. Therefore, it is likely that we
will be forced to develop or apply "real-analytic" tools for
tackling this important communication-theoretic problem.
The paper is organized as follows: Section II provides some
basic definitions and is followed by Section III, which contains
the main result of this paper. As mentioned above, in Section
IV we give an elementary example that shows that the
application of the identity theorem in higher dimensions is, in
general, not admissible if we want to understand the properties
of capacity maximizers of Rayleigh fading channels.
Notation. Throughout the paper we will denote the set of
complex N-by-1 matrices by $M(N \times 1, \mathbb{C})$ and will freely
identify this set with $\mathbb{C}^N$. ln stands for the logarithm to
the base e. Capital letters X, Y, H are reserved for random
variables.

II. Rayleigh fading channel 

We consider a Rayleigh fading channel with coherence
time T = 1, which is described by

$$ Y_m = \sum_{n=1}^{N} H_{mn} X_n + Z_m, \qquad m = 1, \dots, M, \tag{1} $$

with coefficient matrices $Y, Z \in M(M \times 1, \mathbb{C})$,
$X \in M(N \times 1, \mathbb{C})$ and $H \in M(M \times N, \mathbb{C})$, where the
channel $H$ is assumed to be complex circularly symmetric
Gaussian with zero mean and with covariance matrix $\Sigma$, and
the additive noise coefficients $Z_m$ are assumed to be i.i.d.
complex circularly symmetric Gaussian with distribution $\mathcal{CN}(0, \sigma_z^2)$. Let
$\mathcal{P}(X)$ be the set of probability measures on
$(M(N \times 1, \mathbb{C}), \Sigma_{\mathrm{Borel}}(M(N \times 1, \mathbb{C})))$. Then the set



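$$ \mathcal{P}_{g,a}(X) = \Big\{ \mu \in \mathcal{P}(X) \;\Big|\; \int (g(x) - a)\, d\mu(x) \le 0 \Big\} \tag{2} $$
with the average power constraint of the transmitted signal

$$ \int (g(x) - a)\, d\mu(x) = \int \frac{1}{N} \sum_{i=1}^{N} |x_i|^2 \, d\mu(x) - a \le 0 \tag{3} $$

is weak* compact, as was shown in [5] and [4].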
If $\mathcal{P}(Y)$ is the set of conditional probability measures on
$(M(M \times 1, \mathbb{C}), \Sigma_{\mathrm{Borel}}(M(M \times 1, \mathbb{C})))$, we can determine the
channel by a set $\{W(\cdot|x) \in \mathcal{P}(Y) \mid x \in M(N \times 1, \mathbb{C})\}$,
where $W(\cdot|x)$ is absolutely continuous with respect to the
Borel-Lebesgue measure. For the Rayleigh fading channel the
conditional probability density of the received signal $y$ conditioned
on the input symbol $x$ is given by

$$ p(y|x) = \frac{\exp\!\big(-\mathrm{tr}\big[(\sigma_z^2 1_M + (1_M \otimes x^H)\Sigma(1_M \otimes x))^{-1} y y^H\big]\big)}{\pi^M \det\!\big(\sigma_z^2 1_M + (1_M \otimes x^H)\Sigma(1_M \otimes x)\big)} \tag{4} $$

with covariance matrix $\Sigma$ of $H$

$$ \Sigma = E(H \otimes H^{*}). \tag{5} $$
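As a quick sanity check of the density (4), the following minimal sketch specializes to the scalar case M = N = 1, where the covariance reduces to the single number $\sigma_z^2 + \lambda |x|^2$, and verifies numerically that $p(\cdot|x)$ integrates to one. The function names `density_scalar` and `total_mass`, the parameter values, and the midpoint-rule integration are illustrative assumptions and are not taken from the paper.

```python
import math

def density_scalar(y_abs, x_abs, sigma2=1.0, lam=1.0):
    """Density (4) for M = N = 1 as a function of |y| and |x|.

    lam plays the role of the (single) eigenvalue of the channel
    covariance; sigma2 is the noise variance. Both are assumed values.
    """
    v = sigma2 + lam * x_abs ** 2          # variance of Y given X = x
    return math.exp(-y_abs ** 2 / v) / (math.pi * v)

def total_mass(x_abs, sigma2=1.0, lam=1.0, steps=20_000):
    """Integrate p(y|x) over the complex plane in polar coordinates."""
    v = sigma2 + lam * x_abs ** 2
    r_max = 12.0 * math.sqrt(v)            # the tail beyond r_max is negligible
    h = r_max / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * h                  # midpoint rule in the radius
        total += density_scalar(r, x_abs, sigma2, lam) * 2.0 * math.pi * r * h
    return total

masses = [total_mass(x_abs) for x_abs in (0.0, 1.0, 5.0)]
print(masses)  # each entry should be numerically close to 1
```

The check is independent of the input amplitude, which reflects that a larger $|x|$ only spreads the output density over a wider region.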
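Let $\mu \in \mathcal{P}_{g,a}(X)$ be a probability measure and define

$$ f_\mu(y) := \int p(y|x)\, \mu(dx). \tag{6} $$

Then the mutual information of the channel with no CSI at
the receiver is given by

$$ I(\mu; W) = \iint p(y|x) \log \frac{p(y|x)}{f_\mu(y)}\, dy\, d\mu(x). \tag{7} $$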



The mutual information is a weak* continuous functional on
the weak* compact and convex set $\mathcal{P}_{g,a}(X)$ (see [6]). Thus
the functional $I(\mu; W)$ achieves its maximum on $\mathcal{P}_{g,a}(X)$ by
the following

Theorem 2.1 (Cf. [1]): Let $f$ be a weak* continuous real-valued
functional on a weak* compact subset $S$ of $X^*$. Then
$f$ is bounded on $S$ and achieves its maximum on $S$.
The mutual information is a strictly concave functional on
$\mathcal{P}_{g,a}(X)$ up to equivalence of measures. Here, two measures
$\mu, \nu \in \mathcal{P}_{g,a}(X)$ are called equivalent if $f_\mu(y) = f_\nu(y)$.
So its maximum on $\mathcal{P}_{g,a}(X)$ is achieved by a unique input
distribution up to the equivalence defined above [6]. Hence, with

$$ C(a) = \sup_{\mu \in \mathcal{P}_{g,a}(X)} I(\mu; W) \tag{8} $$

there exists a measure $\mu_0 \in \mathcal{P}_{g,a}(X)$ that achieves the capacity
of the channel and is unique up to equivalence of measures.
The aim of this paper is to show that, subject to an average
power constraint, the capacity achieving distribution of the
channel has bounded support.



III. Bounded support of optimal input distribution 

The purpose of this section is to show that the support of
the capacity achieving input measure for the channel given in
(4), with coherence time T = 1, is bounded.
For $r_1, r_2 \in \mathbb{R}$ with $0 < r_1 < r_2$ we set

$$ B(r_1, r_2) := \{x \in M(N \times 1, \mathbb{C}) : r_1 \le \mathrm{tr}(x x^H) \le r_2\}, \tag{9} $$
with $\langle x, x \rangle := \mathrm{tr}(x x^H) = \|x\|^2$.

Lemma 3.1: Let $r_1, r_2 \in \mathbb{R}$ with $0 < r_1 < r_2$ and
$\mu(B(r_1, r_2)) > 0$ with $\mu \in \mathcal{P}_{g,a}(X)$ be given. Then

$$ \int p(y|x) \log f_\mu(y)\, dy \ge \log \frac{\mu(B(r_1, r_2))}{\pi^M n} - \frac{M(\sigma_z^2 + \lambda_{\max} x^H x)}{\sigma_z^2 + \lambda_{\min} r_1} \tag{10} $$

with $n := \max_{x \in B(r_1, r_2)} \det(\sigma_z^2 1_M + (1_M \otimes x^H)\Sigma(1_M \otimes x))$,
where $\lambda_{\min} > 0$ and $\lambda_{\max} > 0$ are the minimum and maximum
eigenvalues of the covariance matrix $\Sigma$.

Proof: By the defining relation (6) we have

$$ f_\mu(y) := \int p(y|x)\, \mu(dx) \ge \int_{B(r_1, r_2)} p(y|x)\, \mu(dx). \tag{11} $$

Next we define

$$ n := \max_{x \in B(r_1, r_2)} \det(\sigma_z^2 1_M + (1_M \otimes x^H)\Sigma(1_M \otimes x)), \tag{12} $$

where the maximum of the function is achieved on $B(r_1, r_2)$
because of the compactness of $B(r_1, r_2)$. Hence,
for $x \in B(r_1, r_2)$ we obtain

$$ p(y|x) \ge \frac{\exp\!\big(-\mathrm{tr}\big[(\sigma_z^2 1_M + (1_M \otimes x^H)\Sigma(1_M \otimes x))^{-1} y y^H\big]\big)}{\pi^M n}. \tag{13} $$

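For every $x \in M(N \times 1, \mathbb{C})$ we have

$$ (\sigma_z^2 + \lambda_{\min} x^H x) 1_M \le \sigma_z^2 1_M + (1_M \otimes x^H)\Sigma(1_M \otimes x) \le (\sigma_z^2 + \lambda_{\max} x^H x) 1_M, \tag{14} $$

where $\lambda_{\min} > 0$ and $\lambda_{\max} > 0$ are the minimum and
maximum eigenvalues of the hermitian and strictly positive
covariance matrix $\Sigma$. By the definition of $B(r_1, r_2)$ we have

$$ r_1 \le \mathrm{tr}(x x^H) = x^H x = \|x\|^2 \qquad (x \in B(r_1, r_2)). \tag{15} $$

Hence, it follows that

$$ \sigma_z^2 + \lambda_{\min} x^H x \ge \sigma_z^2 + \lambda_{\min} r_1. \tag{16} $$

For two operators $A, B \in M(N, \mathbb{C})$ with $A \le B$ and a positive
operator $R \in M(N, \mathbb{C})$ we have

$$ \mathrm{tr}(AR) \le \mathrm{tr}(BR). \tag{17} $$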



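Due to the fact that the operators in (14) are hermitian and
positive and the same holds for $y y^H$, and because the function
$f(A) = -A^{-1}$ is operator monotone for all positive operators
[2], we have

$$ \mathrm{tr}\big[((\sigma_z^2 + \lambda_{\min} r_1) 1_M)^{-1} y y^H\big] \ge \mathrm{tr}\big[((\sigma_z^2 + \lambda_{\min} x^H x) 1_M)^{-1} y y^H\big] \ge \mathrm{tr}\big[(\sigma_z^2 1_M + (1_M \otimes x^H)\Sigma(1_M \otimes x))^{-1} y y^H\big]. \tag{18} $$

With (13) it follows that for $x \in B(r_1, r_2)$

$$ p(y|x) \ge \frac{\exp\!\big(-\mathrm{tr}\big[((\sigma_z^2 + \lambda_{\min} r_1) 1_M)^{-1} y y^H\big]\big)}{\pi^M n}. \tag{19} $$

Inserting this into (11) yields

$$ f_\mu(y) \ge \frac{\mu(B(r_1, r_2))}{\pi^M n} \exp\!\big(-\mathrm{tr}\big[((\sigma_z^2 + \lambda_{\min} r_1) 1_M)^{-1} y y^H\big]\big). \tag{20} $$

Therewith we get

$$ \begin{aligned}
\int p(y|x) \log f_\mu(y)\, dy
&\ge \int p(y|x) \log\!\Big( \frac{\mu(B(r_1, r_2))}{\pi^M n} e^{-\mathrm{tr}[((\sigma_z^2 + \lambda_{\min} r_1) 1_M)^{-1} y y^H]} \Big) dy \\
&= \log A - \int \mathrm{tr}\big[((\sigma_z^2 + \lambda_{\min} r_1) 1_M)^{-1} y y^H\big]\, p(y|x)\, dy \\
&= \log A - \frac{\mathrm{tr}\big(\sigma_z^2 1_M + (1_M \otimes x^H)\Sigma(1_M \otimes x)\big)}{\sigma_z^2 + \lambda_{\min} r_1} \\
&\ge \log A - \frac{\mathrm{tr}\big((\sigma_z^2 + \lambda_{\max} x^H x) 1_M\big)}{\sigma_z^2 + \lambda_{\min} r_1} \\
&= \log A - \frac{M(\sigma_z^2 + \lambda_{\max} \|x\|^2)}{\sigma_z^2 + \lambda_{\min} r_1}
\end{aligned} \tag{21} $$

with $A = \dfrac{\mu(B(r_1, r_2))}{\pi^M n}$. ■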
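The two comparisons that drive the proof, the sandwich bound (14) and the trace comparison obtained from (16), can be checked numerically in a simplified setting. The sketch below assumes independent channel coefficients, so that $\Sigma$ is diagonal and $\sigma_z^2 1_M + (1_M \otimes x^H)\Sigma(1_M \otimes x)$ is a diagonal matrix whose m-th entry is $\sigma_z^2 + \sum_n s_{mn} |x_n|^2$. The helper name `conditional_variances`, the variance table `s`, and all numerical values are hypothetical and only serve the illustration.

```python
import random

def conditional_variances(x, s, sigma2):
    """Diagonal of sigma2*1_M + (1_M (x) x^H) Sigma (1_M (x) x) for diagonal Sigma.

    s[m][n] is the assumed variance of the channel coefficient H_mn.
    """
    return [sigma2 + sum(s[m][n] * abs(x[n]) ** 2 for n in range(len(x)))
            for m in range(len(s))]

random.seed(0)
M, N, sigma2, r1 = 3, 2, 0.5, 1.0
s = [[random.uniform(0.2, 2.0) for _ in range(N)] for _ in range(M)]
lam_min = min(min(row) for row in s)   # smallest eigenvalue of diagonal Sigma
lam_max = max(max(row) for row in s)   # largest eigenvalue of diagonal Sigma

ok_14, ok_trace = True, True
for _ in range(1000):
    x = [complex(random.gauss(0, 2), random.gauss(0, 2)) for _ in range(N)]
    y = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(M)]
    norm_x2 = sum(abs(v) ** 2 for v in x)
    d = conditional_variances(x, s, sigma2)
    lower = sigma2 + lam_min * norm_x2
    upper = sigma2 + lam_max * norm_x2
    # (14): every eigenvalue lies between the two scalar bounds
    ok_14 &= all(lower * (1 - 1e-9) <= dm <= upper * (1 + 1e-9) for dm in d)
    if norm_x2 >= r1:
        # trace with the true covariance vs. trace with the scalar bound at r1
        lhs = sum(abs(y[m]) ** 2 / d[m] for m in range(M))
        rhs = sum(abs(v) ** 2 for v in y) / (sigma2 + lam_min * r1)
        ok_trace &= lhs <= rhs * (1 + 1e-9)
print(ok_14, ok_trace)
```

A general, non-diagonal $\Sigma$ would require an eigenvalue routine; the diagonal case is used here only because it keeps the sketch free of external libraries.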



Determining the capacity achieving input distribution subject
to an average power constraint is a convex optimization
problem. Necessary conditions for the optimal input
distribution can be derived from the local Karush-Kuhn-Tucker
conditions. Together with the fact that the mutual information
is a concave functional and the convexity of the constraint
functional we obtain (see [7] and [6]) that $\mu$ achieves capacity
if and only if

$$ \gamma\Big(\frac{1}{N}\|x\|^2 - a\Big) + C(a) - \int p(y|x) \log \frac{p(y|x)}{f_\mu(y)}\, dy \ge 0 \tag{22} $$

with equality if $x \in \mathrm{supp}(\mu)$, where $\gamma = \gamma(a) \ge 0$ denotes
the Lagrange multiplier and

$$ \int \frac{1}{N} \sum_{n=1}^{N} |x_n|^2 \, d\mu(x) \le a $$

is the constraint under consideration. It is a fairly standard fact
that

$$ \int p(y|x) \log p(y|x)\, dy = -\log\big[(\pi e)^M \det(\sigma_z^2 1_M + (1_M \otimes x^H)\Sigma(1_M \otimes x))\big] $$

and (22) can therefore be rewritten as

$$ \gamma\Big(\frac{1}{N}\|x\|^2 - a\Big) + C(a) + \log(\pi e)^M + \log\det(\sigma_z^2 1_M + (1_M \otimes x^H)\Sigma(1_M \otimes x)) + \int p(y|x) \log f_\mu(y)\, dy \ge 0 \tag{23} $$

with equality if $x \in \mathrm{supp}(\mu)$. Let

$$ \mathrm{KKT}(x) := \gamma\Big(\frac{1}{N}\|x\|^2 - a\Big) + C(a) + \log(\pi e)^M + \log\det(\sigma_z^2 1_M + (1_M \otimes x^H)\Sigma(1_M \otimes x)) + \int p(y|x) \log f_\mu(y)\, dy. \tag{24} $$

Then (22) can be rephrased as $\mathrm{KKT}(x) \ge 0$ for $x \in M(N \times 1, \mathbb{C})$
and $\mathrm{KKT}(x) = 0$ if $x \in \mathrm{supp}(\mu)$.
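The standard fact about $\int p(y|x) \log p(y|x)\, dy$ used above follows in two lines from the form of the density (4). In the short derivation below, $\Sigma_x$ is a shorthand introduced only for this sketch and does not appear in the paper.

```latex
% Shorthand (assumption of this sketch):
%   \Sigma_x := \sigma_z^2 1_M + (1_M \otimes x^H)\Sigma(1_M \otimes x)
\begin{align*}
\log p(y|x) &= -\operatorname{tr}\!\big[\Sigma_x^{-1} y y^H\big] - \log\!\big(\pi^M \det \Sigma_x\big), \\
\int p(y|x)\, y y^H \, dy &= \Sigma_x
  \quad\text{(covariance of a zero-mean complex Gaussian)}, \\
\int p(y|x) \log p(y|x)\, dy
  &= -\operatorname{tr}\!\big[\Sigma_x^{-1} \Sigma_x\big] - \log\!\big(\pi^M \det \Sigma_x\big) \\
  &= -M - \log\!\big(\pi^M \det \Sigma_x\big)
   = -\log\!\big[(\pi e)^M \det \Sigma_x\big].
\end{align*}
```

Substituting this identity into the relative-entropy term of the KKT condition turns $-\int p \log (p / f_\mu)$ into $\log[(\pi e)^M \det \Sigma_x] + \int p \log f_\mu$, which is the form used in the definition of KKT(x).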
The following lemma gives a sufficient condition for the
boundedness of the support of the capacity achieving measure
in terms of the Lagrange multiplier $\gamma$.

Lemma 3.2: Let $a \in \mathbb{R}_+$ be given and let $\mu$ be a capacity
achieving input measure subject to the average power
constraint $a$ for the channel (4). Then $\gamma(a) > 0$ implies that
$\mathrm{supp}(\mu)$ is bounded.

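Proof: The proof is by contradiction. Suppose that
$\gamma(a) = \gamma > 0$ and that $\mathrm{supp}(\mu)$ is not bounded. By our
assumptions we can find $r_1, r_2 \in \mathbb{R}$ with the following
properties:

$$ \mu(B(r_1, r_2)) > 0, \tag{25} $$

$$ \frac{\gamma}{N} - \frac{M \lambda_{\max}}{\sigma_z^2 + \lambda_{\min} r_1} > 0. \tag{26} $$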



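Applying Lemma 3.1 to the function KKT(x) defined in (24)
we obtain the following inequality:

$$ \begin{aligned}
\mathrm{KKT}(x) &\ge \gamma\Big(\frac{1}{N}\|x\|^2 - a\Big) + C(a) + \log(\pi e)^M + \log\det(\sigma_z^2 1_M + (1_M \otimes x^H)\Sigma(1_M \otimes x)) \\
&\quad + \log A - \frac{M(\sigma_z^2 + \lambda_{\max}\|x\|^2)}{\sigma_z^2 + \lambda_{\min} r_1} \\
&= \|x\|^2 \Big(\frac{\gamma}{N} - \frac{M \lambda_{\max}}{\sigma_z^2 + \lambda_{\min} r_1}\Big) - \gamma a + C(a) + \log(\pi e)^M \\
&\quad + \log\det(\sigma_z^2 1_M + (1_M \otimes x^H)\Sigma(1_M \otimes x)) + \log A - \frac{M \sigma_z^2}{\sigma_z^2 + \lambda_{\min} r_1}.
\end{aligned} \tag{27} $$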



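Combining the Karush-Kuhn-Tucker conditions and (27) we
obtain that for any $x \in \mathrm{supp}(\mu)$

$$ \begin{aligned}
0 = \mathrm{KKT}(x) &\ge \|x\|^2 \Big(\frac{\gamma}{N} - \frac{M \lambda_{\max}}{\sigma_z^2 + \lambda_{\min} r_1}\Big) - \gamma a + C(a) + \log(\pi e)^M \\
&\quad + \log\det(\sigma_z^2 1_M + (1_M \otimes x^H)\Sigma(1_M \otimes x)) + \log A - \frac{M \sigma_z^2}{\sigma_z^2 + \lambda_{\min} r_1}.
\end{aligned} \tag{28} $$

But this last inequality, together with our assumption that $\mathrm{supp}(\mu)$ is
not bounded, (26), and the fact that

$$ \|x\|^2 \Big(\frac{\gamma}{N} - \frac{M \lambda_{\max}}{\sigma_z^2 + \lambda_{\min} r_1}\Big) \to \infty $$

and

$$ \log\det(\sigma_z^2 1_M + (1_M \otimes x^H)\Sigma(1_M \otimes x)) \to \infty \quad \text{as } \|x\| \to \infty, $$

implies that $0 \ge \infty$, which is the desired contradiction. ■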
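Condition (26) is what makes the quadratic term in (28) grow, and it can always be met by taking $r_1$ large enough once $\gamma > 0$: rearranging gives $r_1 > (M N \lambda_{\max}/\gamma - \sigma_z^2)/\lambda_{\min}$. The sketch below evaluates this threshold for one set of made-up parameters; the function names `slack_26` and `r1_threshold` and all numbers are illustrative assumptions. That an annulus beyond the threshold carries positive mass, as required in (25), is where the unbounded support is used.

```python
def slack_26(r1, gamma, M, N, lam_min, lam_max, sigma2):
    """Left-hand side of (26): must be positive for the argument to work."""
    return gamma / N - M * lam_max / (sigma2 + lam_min * r1)

def r1_threshold(gamma, M, N, lam_min, lam_max, sigma2):
    """Smallest admissible radius parameter: (26) holds exactly for r1 above it."""
    return max(0.0, (M * N * lam_max / gamma - sigma2) / lam_min)

# Hypothetical parameter values, chosen only for illustration.
params = dict(gamma=0.1, M=2, N=2, lam_min=0.5, lam_max=2.0, sigma2=1.0)
r_star = r1_threshold(**params)
above = slack_26(1.01 * r_star, **params)   # slightly above the threshold
below = slack_26(0.99 * r_star, **params)   # slightly below the threshold
print(r_star, above, below)
```

A small multiplier $\gamma$ pushes the threshold outward, which is consistent with the proof needing $\gamma > 0$ strictly.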
In view of Lemma 3.2 our remaining goal is to show that
$\gamma(a) > 0$ for each $a \in \mathbb{R}_+$. For example, in [1] Abou-Faycal,
Trott, and Shamai showed this in the scalar case. Our proof
of the corresponding result in the MIMO case below is strongly
motivated by their approach via Fano's inequality.

Lemma 3.3: For the channel given in (4) we have $\gamma(a) > 0$
for each $a \in \mathbb{R}_+$.

Proof: As mentioned above, the proof is an extension
of the argument given in [1]. The capacity functional $C(\cdot)$ is
a non-decreasing and concave function of the argument $a \in \mathbb{R}_+$.
It was observed in [1] using global Karush-Kuhn-Tucker
conditions that $\gamma(a)$ is the slope of the tangent line to $C(\cdot)$ at
$a$ (cf. [1], Section III.B and Appendix II.A). Thus, since $C(a)$
is non-decreasing and concave, it can be shown that $\gamma(a) = 0$
implies $C(a') = C(a)$ for all $a' > a$.¹ Consequently, we can
rule out the possibility that $\gamma(a) = 0$ by showing the existence
of a sequence of input measures such that the corresponding
sequence of mutual informations approaches $\infty$.
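We will be done if there is $\lambda > 0$ such that for each $n \in \mathbb{N}$
we can find distinct $x_1 = x_1(n), \dots, x_n = x_n(n) \in \mathbb{C}^N$ and
disjoint measurable sets $B_1 = B_1(n), \dots, B_n = B_n(n) \subset \mathbb{C}^M$
such that

$$ \int_{B_i} p(y|x_i)\, dy \ge \lambda $$

for all $i = 1, \dots, n$. Indeed, a simple application of Fano's
inequality with block length 1 then shows that for the input
measures $\mu_n := \frac{1}{n} \sum_{i=1}^{n} \delta_{x_i}$ ($\delta_{x_i}$ is the point measure
concentrated on $x_i$) we have

$$ I(\mu_n; W) \ge \lambda \log n - 1. $$

Now we define

$$ \lambda := \frac{1}{2} e^{-1} \frac{\omega_{2M} \lambda_{\min}^M}{2 \pi^M \lambda_{\max}^M} > 0, $$

where $\omega_{2M}$ denotes the surface area of the unit sphere in
$\mathbb{R}^{2M}$ and $\lambda_{\min}$, $\lambda_{\max}$ are the smallest and the largest
eigenvalues of $\Sigma$, and let $n \in \mathbb{N}$ be given.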
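To see why a uniform lower bound on the probability of correct decoding forces the mutual information to diverge, one can evaluate the standard consequence of Fano's inequality: for n equiprobable inputs, each decoded correctly with probability at least $\lambda$, the mutual information in nats is at least $\lambda \ln n - \ln 2$. The sketch below is only an illustration of this growth; the function names `fano_lower_bound` and `inputs_needed` and the sample values are assumptions, not part of the paper.

```python
import math

def fano_lower_bound(n, lam):
    """Lower bound (in nats) on I for n equiprobable inputs.

    Follows from Fano's inequality when every input is decoded
    correctly with probability at least lam.
    """
    return lam * math.log(n) - math.log(2.0)

def inputs_needed(target, lam):
    """Number of inputs n for which the bound reaches `target` nats."""
    return math.ceil(math.exp((target + math.log(2.0)) / lam))

lam = 0.1                                   # hypothetical decoding probability
n_required = inputs_needed(1.0, lam)
print(n_required, fano_lower_bound(n_required, lam))
```

Even for a small $\lambda$ the bound is unbounded in n; only the number of mass points needed grows, exponentially in $1/\lambda$.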

We will now present the construction of the vectors $x_1 =
x_1(n), \dots, x_n = x_n(n) \in \mathbb{C}^N$ and the decoding sets $B_1 =
B_1(n), \dots, B_n = B_n(n)$. Let $x \in \mathbb{C}^N$ with $\|x\| = 1$ be fixed
and consider a large positive real number $K = K(n) > 1$ that
will be specified later. Set $x_i := K_i x$ for $i = 1, \dots, n$, where
$K_i := K^{2i}$.

¹This implication is not obvious since $C(\cdot)$ need not be differentiable.
However, $C(\cdot)$ is differentiable a.e. due to the monotonicity and concavity.
The proof that $C(a) = C(a')$ for all $a' > a$ follows a standard line of
reasoning from real analysis and is skipped due to space limitations.
The full argument will be given elsewhere.



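Let $\lambda_{\min}$ denote the smallest eigenvalue of $\Sigma$. For $i = 1, \dots, n$
we set

$$ r_i = r_i(K) := \sigma_z^2 + \lambda_{\min} K_i^2 \tag{29} $$
and $B_i := D(r_i, r_{i+1})$ where

$$ D(r_i, r_{i+1}) = \{y \in \mathbb{C}^M : r_i \le \mathrm{tr}(y y^H) = \langle y, y \rangle < r_{i+1}\}. $$
As shown in the proof of Lemma 3.1 we have

$$ p(y|x) \ge \frac{\exp\!\big(-\|y\|^2 / (\sigma_z^2 + \lambda_{\min}\|x\|^2)\big)}{\pi^M \det\!\big((\sigma_z^2 + \lambda_{\max}\|x\|^2) 1_M\big)}. \tag{30} $$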



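Using (30) and transforming to spherical coordinates in $\mathbb{C}^M
\cong \mathbb{R}^{2M}$ we obtain

$$ \int_{B_i} p(y|x_i)\, dy \ge \frac{\omega_{2M}}{\pi^M (\sigma_z^2 + \lambda_{\max} K_i^2)^M} \int_{\sqrt{r_i}}^{\sqrt{r_{i+1}}} e^{-a_i r^2} r^{2M-1}\, dr, \tag{31} $$

where $\omega_{2M}$ denotes the surface area of the unit sphere in $\mathbb{C}^M$
and $a_i = a_i(K) := \dfrac{1}{\sigma_z^2 + \lambda_{\min} K_i^2}$. After the substitution $t =
a_i r^2$ in the integral on the RHS of the inequality (31) we
arrive at

$$ \int_{B_i} p(y|x_i)\, dy \ge \frac{\omega_{2M} (\sigma_z^2 + \lambda_{\min} K_i^2)^M}{2 \pi^M (\sigma_z^2 + \lambda_{\max} K_i^2)^M} \int_{a_i r_i}^{a_i r_{i+1}} e^{-t} t^{M-1}\, dt. \tag{32} $$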
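The change of variables $t = a r^2$ behind this step turns the radial integral $\int e^{-a r^2} r^{2M-1}\, dr$ into $\frac{1}{2 a^M} \int e^{-t} t^{M-1}\, dt$ with the limits mapped accordingly. The following sketch checks this identity numerically and also evaluates the sphere area via the closed form $\omega_{2M} = 2\pi^M/(M-1)!$. The helper names, the midpoint rule, and the test values of `a`, `M`, and the radii are assumptions made for this illustration.

```python
import math

def midpoint(f, lo, hi, steps=20_000):
    """Composite midpoint rule for a smooth integrand on [lo, hi]."""
    h = (hi - lo) / steps
    return sum(f(lo + (k + 0.5) * h) for k in range(steps)) * h

def radial_integral(a, M, r_lo, r_hi):
    """Integral of exp(-a r^2) r^(2M-1) over [r_lo, r_hi]."""
    return midpoint(lambda r: math.exp(-a * r * r) * r ** (2 * M - 1), r_lo, r_hi)

def substituted_integral(a, M, r_lo, r_hi):
    """Same quantity after t = a r^2: (1 / (2 a^M)) * integral of exp(-t) t^(M-1)."""
    inner = midpoint(lambda t: math.exp(-t) * t ** (M - 1),
                     a * r_lo ** 2, a * r_hi ** 2)
    return inner / (2.0 * a ** M)

def unit_sphere_area(M):
    """Surface area of the unit sphere in R^(2M), i.e. 2 pi^M / (M-1)!."""
    return 2.0 * math.pi ** M / math.factorial(M - 1)

lhs = radial_integral(0.3, 3, 1.0, 4.0)
rhs = substituted_integral(0.3, 3, 1.0, 4.0)
print(lhs, rhs, unit_sphere_area(3))
```

The factor $1/a^M$ produced by the substitution is exactly what moves $(\sigma_z^2 + \lambda_{\min} K_i^2)^M$ into the numerator of the prefactor.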



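In what follows we use the abbreviation

$$ F(K_i) := \frac{\omega_{2M} (\sigma_z^2 + \lambda_{\min} K_i^2)^M}{2 \pi^M (\sigma_z^2 + \lambda_{\max} K_i^2)^M}. \tag{33} $$

The defining relation (29) and our assumption that $K > 1$
ensure that $a_i r_i \ge 1$. Using this and (32) we are led to

$$ \int_{B_i} p(y|x_i)\, dy \ge F(K_i) \int_{a_i r_i}^{a_i r_{i+1}} e^{-t} t^{M-1}\, dt \ge F(K_i) \int_{a_i r_i}^{a_i r_{i+1}} e^{-t}\, dt = F(K_i) \big(e^{-a_i r_i} - e^{-a_i r_{i+1}}\big) \tag{34} $$

for all $i = 1, \dots, n$. Now, since $K_i = K^{2i}$, $a_i = a_i(K) :=
\dfrac{1}{\sigma_z^2 + \lambda_{\min} K_i^2}$, and $r_i = r_i(K) := \sigma_z^2 + \lambda_{\min} K_i^2$, it is clear that

$$ a_i r_i = 1 \quad \text{and} \quad a_i r_{i+1} = \frac{\sigma_z^2 + \lambda_{\min} K_{i+1}^2}{\sigma_z^2 + \lambda_{\min} K_i^2} \to \infty \quad \text{as } K \to \infty, $$

and from (33) we have

$$ F(K_i) \to \frac{\omega_{2M} \lambda_{\min}^M}{2 \pi^M \lambda_{\max}^M} \quad \text{as } K \to \infty $$

for all $i = 1, \dots, n$. Thus if we choose our $K$ sufficiently
large, (34) and these limit relations ensure that

$$ \int_{B_i} p(y|x_i)\, dy \ge \frac{1}{2} e^{-1} \frac{\omega_{2M} \lambda_{\min}^M}{2 \pi^M \lambda_{\max}^M} = \lambda > 0 $$

for all $i = 1, \dots, n$. Moreover it is clear that the sequence
of second moments of the measures $\mu_n = \frac{1}{n} \sum_{i=1}^{n} \delta_{x_i}$ can
be made arbitrarily large for large $K(n)$. This concludes our
proof by the remarks given at the beginning of the argument.

■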

We can now summarize the results obtained so far as follows:

Theorem 3.4: We consider the channel defined by (4).
Then the support of the capacity achieving input measure is
bounded.

Proof: Simply apply Lemma 3.3 and Lemma 3.2. ■

IV. Discussion 

With the embedding function $\xi : \mathbb{C}^N \to \mathbb{R}^{2N} \subset \mathbb{C}^{2N}$ with
$z_{2i-1} = \mathrm{Re}(x_i)$ and $z_{2i} = \mathrm{Im}(x_i)$ and the transformed channel
we get an extension of the function

$$ \mathrm{KKT}(x) : M(N \times 1, \mathbb{C}) \to \mathbb{R} $$
to

$$ \mathrm{KKT}(z) : M(2N \times 1, \mathbb{C}) \to \mathbb{C} $$
where

$$ \mathrm{KKT}(z) := \gamma\Big(\frac{1}{N} z^T z - a\Big) + C(a) - \int \tilde{p}(\tilde{y}|z) \log \frac{\tilde{p}(\tilde{y}|z)}{\tilde{f}_\mu(\tilde{y})}\, d\tilde{y}. \tag{35} $$

Here $\tilde{p}$ and $\tilde{y} \in M(2M \times 1, \mathbb{R})$ are obtained by changing the channel
matrix and the channel output according to the transformation
of the input under $\xi$ (see [3], p. 2081, and [5]). Moreover it
is easily seen using Fubini's theorem from measure theory
and Morera's theorem from complex analysis in several
variables (cf. [9]) that this extension of the function KKT is
holomorphic. But, unfortunately, it is not true that the identity
theorem (also known as the uniqueness theorem) holds for
open sets in $\mathbb{R}^{2N}$, as the following standard example shows:
Example. We consider the simplest non-trivial case $\mathbb{C}^2$. Let
$\{e_1, e_2\}$ denote the standard basis of $\mathbb{C}^2$ and let $f : \mathbb{C}^2 \to \mathbb{C}$
be defined as

$$ f(z) := z^T e_2 = z_1 \cdot 0 + z_2 \cdot 1 = z_2 $$

where $T$ denotes the transpose and $z_1, z_2$ are the coordinates
of $z \in \mathbb{C}^2$ with respect to the basis $\{e_1, e_2\}$. Clearly, $f$ is
holomorphic and the set of zeros of $f$ is

$$ N(f) = \{\mathbb{C} \cdot e_1\} \simeq \mathbb{R}^2. $$

In what follows we identify $N(f)$ with $\mathbb{R}^2$. $\mathbb{R}^2$ is, by definition,
open in the natural topology on $\mathbb{R}^2$ (but it is not open in
the natural topology of $\mathbb{C}^2$; it is a closed linear subspace of
$\mathbb{C}^2$), and the function $f$ is evidently not identically zero on
$\mathbb{C}^2$.
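The example can be replayed in a few lines: the function below vanishes at every point of the line $\mathbb{C} \cdot e_1$, which carries a full copy of $\mathbb{R}^2$, yet it is nonzero elsewhere in $\mathbb{C}^2$. The sample points are arbitrary choices made for this illustration.

```python
def f(z):
    """f(z) = z^T e_2 = z_1 * 0 + z_2 * 1 for z = (z_1, z_2) in C^2."""
    z1, z2 = z
    return z1 * 0 + z2 * 1

# Points of the zero set C * e_1: first coordinate arbitrary, second zero.
on_line = [f((complex(re, im), 0j)) for re in (-1.0, 0.5) for im in (0.0, 2.0)]
# A point outside the zero set.
off_line = f((1 + 1j, 2 - 3j))
print(on_line, off_line)
```

No open subset of $\mathbb{C}^2$ is contained in the zero set, which is why the identity theorem is not contradicted by this function.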

Note that this example with identical arguments also shows
that the conclusion of the identity theorem is not valid
for open balls, say, in $\mathbb{R}^2 \subset \mathbb{C}^2$. If $B \subset \mathbb{R}^2 \subset \mathbb{C}^2$ is any open
ball in $\mathbb{R}^2$ then $f(z) = 0$ for all $z \in B$ but, again, $f \not\equiv 0$ on
$\mathbb{C}^2$. The reason is, as before, that an open ball in $\mathbb{R}^2$ (with the
natural topology of $\mathbb{R}^2$) is not open in the topology of $\mathbb{C}^2$.
This last example shows that the proof of Proposition 4.3 in
[5] is not correct, since it assumes the validity of the identity
theorem in exactly this setting. It is this Proposition 4.3 in
[5] which would allow us to conclude that the support of the
capacity achieving input measure contains no open sets (in
$\mathbb{C}^N \simeq \mathbb{R}^{2N}$) provided we know that this support is bounded.
In fact, the authors of this paper are convinced that
different mathematical techniques, not relying on the identity
theorem, are needed to tackle the problem of characterizing
the optimal inputs for multiple antenna Rayleigh fading systems.
One reason for this opinion is the fact that the characterization
of sets for which the identity theorem holds (so-called sets of
uniqueness) in the setting of several complex variables is a
long-standing, challenging open problem in complex analysis.

V. Conclusions and future work 

We have shown that for a Rayleigh fading channel with 
coherence time T = 1 the support of the capacity achieving 
input measure is bounded. Our method of proof does not 
allow to extend the results to the case T > 1. In fact the 
techniques we have used have to be substantially sharpened 
and supplemented by additional new tools. Furthermore we 
have shown that the approach based on the application of the 
identity theorem from the complex analysis in several variables 
is not admissible. Therefore, it seems highly likely for us that 
the techniques needed should be "real-analytic" in spirit. 